Concept Drift Detection with Hierarchical Hypothesis Testing | Proceedings of the 2017 SIAM International Conference on Data Mining | Society for Industrial and Applied Mathematics

نویسندگان

  • Shujian Yu
  • Zubin Abraham
چکیده

When using statistical models (such as a classifier) in a streaming environment, there is often a need to detect and adapt to concept drifts to mitigate any deterioration in the model’s predictive performance over time. Unfortunately, the ability of popular concept drift approaches in detecting these drifts in the relationship of the response and predictor variable is often dependent on the distribution characteristics of the data streams, as well as its sensitivity on parameter tuning. This paper presents Hierarchical Linear Four Rates (HLFR), a framework that detects concept drifts for different data stream distributions (including imbalanced data) by leveraging a hierarchical set of hypothesis tests in an online setting. The performance of HLFR is compared to benchmark approaches using both simulated and real-world datasets spanning the breadth of concept drift types. HLFR significantly outperforms benchmark approaches in terms of accuracy, G-mean, recall, delay in detection and adaptability across the various datasets.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multimodal Network Alignment | Proceedings of the 2017 SIAM International Conference on Data Mining | Society for Industrial and Applied Mathematics

A multimodal network encodes relationships between the same set of nodes in multiple settings, and network alignment is a powerful tool for transferring information and insight between a pair of networks. We propose a method for multimodal network alignment that computes a matrix which indicates the alignment, but produces the result as a lowrank factorization directly. We then propose new meth...

متن کامل

Contextual Time Series Change Detection | Proceedings of the 2013 SIAM International Conference on Data Mining | Society for Industrial and Applied Mathematics

Time series data are common in a variety of fields ranging from economics to medicine and manufacturing. As a result, time series analysis and modeling has become an active research area in statistics and data mining. In this paper, we focus on a type of change we call contextual time series change (CTC) and propose a novel two-stage algorithm to address it. In contrast to traditional change de...

متن کامل

Organization of Gatekeeping and Mental Framework in the System of Representation and Hierarchical Relational Structures of the Modern Society

Critical discourse analysis as a type of social practice reveals how linguistic choices enable speakers to manipulate the realizations of agency and power in the representation of action.The present study examines the relationship between language and ideology and explores how such a relationship is represented in the analysis of spoken text and to show how declarative knowledge, beliefs, attit...

متن کامل

Concept drift detection in business process logs using deep learning

Process mining provides a bridge between process modeling and analysis on the one hand and data mining on the other hand. Process mining aims at discovering, monitoring, and improving real processes by extracting knowledge from event logs. However, as most business processes change over time (e.g. the effects of new legislation, seasonal effects and etc.), traditional process mining techniques ...

متن کامل

Proceedings of the IJCAI 2017 Workshop on Learning in the Presence of Class Imbalance and Concept Drift (LPCICD'17)

Data Mining is faced with new challenges. In emerging applications (like financial data, traffic TCP/IP, sensor networks, etc) data continuously flow eventually at high speed. The processes generating data evolve over time, and the concepts we are learning change. In this talk we present a one-pass classification algorithm able to detect and react to changes. We present a framework that identif...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017